EAT DELIVERY, a company specialising in food order delivery services, would like to improve their delivery order assignment process. Therefore, they are planning to develop an automated system to optimize the delivery order assignment and route schedule for their riders. For this hackathon, we need to support the company to design the system.
The Hackathon challenges can be referred to two datasets provided: riders.csv and orders.csv.
All the orders from customers in Singapore are presented in the data file, orders.csv. A snapshot for the file is presented as below.
orders <- read.csv("orders.csv",row.names=1)
head(orders)
## Postal_Code Longitude Latitude
## 1 100082 103.8079 1.2739
## 2 100108 103.8103 1.2810
## 3 101103 103.8134 1.2807
## 4 109344 103.8048 1.2805
## 5 109691 103.8030 1.2822
## 6 109918 103.8112 1.2682
#The size of the dataset
dim(orders)
## [1] 2965 3
The distribution of the orders is shown in Red on the map.
The data file, riders.csv, includes location information of riders who are available to take up new deliveries. A snapshot is presented as below.
riders <- read.csv("riders.csv",row.names=1)
head(riders)
## Postal_Code Longitude Latitude
## 5001 118109 103.7907 1.2795
## 5002 120117 103.7712 1.3245
## 5003 129168 103.7564 1.3229
## 5004 188971 103.8536 1.2992
## 5005 189862 103.8585 1.3011
## 5006 199572 103.8617 1.3016
#The size of the dataset
dim(riders)
## [1] 100 3
The distribution of the riders is visualized in blue on the map.
To simplify the development, we remove the less important information, and could just use the simple plot and points functions to present the spatial data. You may study the details in Help.
plot(orders[,2:3], col="red")
points(riders[,2:3], col="blue")
Your goal is to come up with an automated system providing EAT DELIVERY with the case assignments for each rider. The automated case assignment system will address the challenges as below:
The case is provided by a local anonymous company and is further revised to fit for the purpose of this hackathon. To reduce the complexity of of the problem, we have made several assumptions.
Each rider is at his/her current location, where the restaurants are located and have the unlimited food supply to fulfil all customer orders. Therefore, you do not have to consider the case of the riders having to visit the restaurants to pick up food once the orders are assigned. After the riders have delivered all orders assigned, they do not need to return to the original starting point.
The actual travel paths along roads are not considered. The distance of the path between any two location points is only calculated by Havensine Distance Function presented as below.
Havensine Distance Function (in meter)
Havensine <-function(slong,slat,long,lat){
a=6378137*acos(
cos((pi/180)*lat)*cos((pi/180)*long)*cos((pi/180)*slat)*cos((pi/180)*slong)
+ cos((pi/180)*lat)*sin((pi/180)*long)*cos((pi/180)*slat)*sin((pi/180)*slong)+
sin((pi/180)*lat)*sin((pi/180)*slat));
round(a,3)
}
#Test the function
Havensine(103.7911,1.2797,103.7712,1.3243)
## [1] 5436.411
Alternatively, the geosphere::distHaversine() function may be used to calculate the Havensine distance. Make sure that your package has been installed by executing install.packages(“geosphere”) before it is used.
library("geosphere")
distHaversine(c(103.7911,1.2797),c(103.7712,1.3243))
## [1] 5436.411
Each order takes 1.5 minutes to reach the customer when the rider arrives at the customer’s building. The riders’ travelling speed is 48km/hr between two locations. Traffic lights and traffic congestion are not considered. The company’s promise to the customers is to deliver the orders within one hour.
You may make more assumptions for the system if it is needed.
There are a number of methods to solve the problems. The following demonstration illustrates one of the methods to find the solution with the purpose for you to understand the challenges better. There should be the other better solutions for you to explore.
We may develop or apply a method to divide the orders dataset into different subsets assigned to the dedicated riders.
For example, we may use some clustering methods such as k-means which produces the clustering result as below.
plot(orderPoints, col=groupIndices+1)
points(riders[,2:3], col=1:100,pch = ".")
The Rider 5068 is taken as an example. The orders assigned to him/her may be like in this way.
ci = 68; # cluster index is 68 with respect to rider 5068
(clusterMembers=orderPoints[groupIndices==ci,])
## Longitude Latitude
## 1849 103.8604 1.3648
## 1850 103.8612 1.3657
## 1851 103.8611 1.3657
## 1852 103.8608 1.3658
## 1858 103.8658 1.3624
## 1859 103.8659 1.3620
## 1865 103.8674 1.3643
## 1867 103.8622 1.3675
## 1868 103.8624 1.3680
## 1869 103.8629 1.3687
## 1870 103.8629 1.3689
## 1871 103.8632 1.3693
## 1887 103.8673 1.3633
## 1888 103.8656 1.3634
## 1889 103.8623 1.3657
## 1896 103.8646 1.3661
## 1897 103.8633 1.3671
## 1899 103.8654 1.3632
## 1901 103.8658 1.3650
## 1912 103.8661 1.3653
## 1913 103.8663 1.3666
## 1915 103.8672 1.3698
## 1916 103.8646 1.3682
## 1917 103.8646 1.3694
## 1918 103.8642 1.3687
## 1921 103.8642 1.3655
## 1922 103.8633 1.3658
## 1923 103.8647 1.3620
## 1924 103.8620 1.3645
## 1925 103.8655 1.3663
## 1926 103.8638 1.3683
## 1932 103.8671 1.3629
## 1936 103.8661 1.3631
## 1952 103.8600 1.3625
## 1956 103.8653 1.3661
## 1957 103.8649 1.3661
## 1958 103.8643 1.3673
## 1959 103.8634 1.3683
## 1962 103.8620 1.3626
## 1963 103.8620 1.3634
## 1964 103.8616 1.3634
## 1969 103.8682 1.3642
## 1974 103.8567 1.3635
## 1978 103.8577 1.3716
## 2009 103.8621 1.3706
We visualize the assignment on the map by the basic plot and points functions.
plot(clusterMembers, col="red") # The location of delivery orders assigned to the rider
points(ridersPoints["5068",], col="blue") # the location of the Rider 5068
Alternatively, the cluster is visualized on the map.
Is it easy for us to recommend the tour for this single case? How can we design a solution to recommend the optimal tour for Rider 5068 to reach all the order points with the least effort?
We may refer to TSP package to provide the solution as below by applying Nearest Neighbor method.
## [1] "5068" "1916" "1918" "1926" "1959" "1869" "1870" "1871" "1917" "1958"
## [11] "1897" "1867" "1868" "1889" "1922" "1921" "1896" "1957" "1956" "1925"
## [21] "1913" "1912" "1901" "1888" "1899" "1936" "1858" "1859" "1923" "1932"
## [31] "1887" "1865" "1969" "1915" "2009" "1978" "1852" "1851" "1850" "1849"
## [41] "1924" "1963" "1964" "1962" "1952" "1974"
The route can be visualized on the graph.
plot(resultPath, col="red")
s <- seq(nrow(resultPath)-1) # one shorter than data
x=resultPath[,1]
y=resultPath[,2]
arrows(x[s], y[s], x[s+1], y[s+1], col = "red",length = 0.1)
points(ridersPoints["5068",], col="blue")
The route is visualized on the map in this way.
However, there should be the other better method to solve this. You may use TSP or find the other package and configure the settings to provide the better solution. Or more credits will be given if you develop and use your own new algorithms.
We perform the similar calculation for the rest riders. A table for the solution the Delivery Orders Assignment and Routing Information for all Riders is generated.
Once you have the results in the data table, You need to generate two files. Providing that you define resultTable as the data.frame variable for the result, you may use the R functions below to generate the files which will be submitted.
#save(resultTable,file="resultTable.Rda") # remove "#"" when the function is needed
#write.csv(resultTable,file="resultTable.csv",row.names = FALSE)
The details of the structure of the solution data format are prestented as below.
dim(resultTable)
## [1] 100 5
str(resultTable)
## 'data.frame': 100 obs. of 5 variables:
## $ RiderID : Factor w/ 100 levels "5001","5002",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ OrderSetID : chr "49,20,21,47,48,53,52,22,23,17,16,46,19,38,39,40,41,11,32,8,18,12,33,34,15,25,24,28,26,27,29,42,31" "2227,2228,2229,2182,2177,2178,2174,2179,2175,2183,2184,55,2181,2180,2176,2157,98,97,56" "96,101,100,99,81,63,80,78,65,73,74,77,76,70,69,57,79,88,62,87,86,85,91,93,83,94,92,60,61,84,90,89,95,75,72,2245" "170,162,216,238,239,240,241,205,204,249,203,222,225,224,248,247,242,243,255,256,282,283,268,269,201,202,212,245"| __truncated__ ...
## $ Total_Distance: num 7530 5682 10255 6133 4888 ...
## $ No_Of_Orders : int 33 19 36 45 37 3 45 25 20 27 ...
## $ Total_time : num 58.9 35.6 66.8 75.2 61.6 ...
A snapshot of some riders’ solutions is demonstrated as below.
kable(resultTable[c(6,45,89,100),],row.names = FALSE)
| RiderID | OrderSetID | Total_Distance | No_Of_Orders | Total_time |
|---|---|---|---|---|
| 5006 | 859,861,1118 | 3047.146 | 3 | 8.308933 |
| 5045 | 961,960,894,882,883,1112,886,895 | 1693.842 | 8 | 14.117303 |
| 5089 | 2534,2533,2532 | 9573.965 | 3 | 16.467457 |
| 5100 | 2958,2957,2961,2960,2964,2959 | 4867.267 | 6 | 15.084084 |
On the basis of the results in the table, we can calculate how many late delivery orders (i.e. more than one hour) to the customers. A example of the code below is used to find how many riders who could not complete the orders on time.
sum(resultTable$Total_time>60)
## [1] 36
You need to take close looks into the order assignments and routes to make further improvement in order to minimize the number of late delivery orders and delivery time to the customers.
In addition to the four criteria stated in Section 3, you may suggest additional measurement criteria to evaluate your approach and outputs. Extra credits will be given if the team can provide better solution by changing or making some assumptions to make the solution more realistic and practical.
Whilst R is encouraged, you may use your own programming language. Please make sure that your programme can at least produce a csv file with the required format presented in Table 3.
The evaluation criteria also include the quality of your code, report and your possible presentation on the Finals. Each should be clear and concise.
You are encouraged to submit the solution by solving the problem as much as you can. You need to submit a report and your codes. The Rda and csv files with the required solution format are needed to be submitted. You can define your own report format. The report should not be more than 8 pages. Please submit all codes, any files generated, and a report to sbiz_anl@suss.edu.sg . The timeline is as below.